{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d8c1d5d2",
   "metadata": {},
   "source": [
    "# Hypothesis testing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "30615f97",
   "metadata": {},
   "outputs": [],
   "source": [
    "from os.path import basename, exists\n",
    "\n",
    "\n",
    "def download(url):\n",
    "    filename = basename(url)\n",
    "    if not exists(filename):\n",
    "        from urllib.request import urlretrieve\n",
    "\n",
    "        local, _ = urlretrieve(url, filename)\n",
    "        print(\"Downloaded \" + local)\n",
    "\n",
    "\n",
    "download(\"https://github.com/AllenDowney/ThinkStats2/raw/master/code/thinkstats2.py\")\n",
    "download(\"https://github.com/AllenDowney/ThinkStats2/raw/master/code/thinkplot.py\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "684b56bd",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "import random\n",
    "\n",
    "import thinkstats2\n",
    "import thinkplot"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f4bbe46",
   "metadata": {},
   "source": [
    "## Classical hypothesis testing\n",
    "\n",
    "Exploring the data from the NSFG, we saw several \"apparent effects,\"\n",
    "including differences between first babies and others. So far we have\n",
    "taken these effects at face value; in this chapter, we put them to the\n",
    "test.\n",
    "\n",
    "The fundamental question we want to address is whether the effects we\n",
    "see in a sample are likely to appear in the larger population. For\n",
    "example, in the NSFG sample we see a difference in mean pregnancy length\n",
    "for first babies and others. We would like to know if that effect\n",
    "reflects a real difference for women in the U.S., or if it might appear\n",
    "in the sample by chance.\n",
    "\n",
    "There are several ways we could formulate this question, including\n",
    "Fisher null hypothesis testing, Neyman-Pearson decision theory, and\n",
    "Bayesian inference[^1]. What I present here is a subset of all three\n",
    "that makes up most of what people use in practice, which I will call\n",
    "**classical hypothesis testing**."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef77dd2f",
   "metadata": {},
   "source": [
    "The goal of classical hypothesis testing is to answer the question,\n",
    "\"Given a sample and an apparent effect, what is the probability of\n",
    "seeing such an effect by chance?\" Here's how we answer that question:\n",
    "\n",
    "-   The first step is to quantify the size of the apparent effect by\n",
    "    choosing a **test statistic**. In the NSFG example, the apparent\n",
    "    effect is a difference in pregnancy length between first babies and\n",
    "    others, so a natural choice for the test statistic is the difference\n",
    "    in means between the two groups.\n",
    "\n",
    "-   The second step is to define a **null hypothesis**, which is a model\n",
    "    of the system based on the assumption that the apparent effect is\n",
    "    not real. In the NSFG example the null hypothesis is that there is\n",
    "    no difference between first babies and others; that is, that\n",
    "    pregnancy lengths for both groups have the same distribution.\n",
    "\n",
    "-   The third step is to compute a **p-value**, which is the probability\n",
    "    of seeing the apparent effect if the null hypothesis is true. In the\n",
    "    NSFG example, we would compute the actual difference in means, then\n",
    "    compute the probability of seeing a difference as big, or bigger,\n",
    "    under the null hypothesis.\n",
    "\n",
    "-   The last step is to interpret the result. If the p-value is low, the\n",
    "    effect is said to be **statistically significant**, which means that\n",
    "    it is unlikely to have occurred by chance. In that case we infer\n",
    "    that the effect is more likely to appear in the larger population."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f95af12c",
   "metadata": {},
   "source": [
    "The logic of this process is similar to a proof by contradiction. To\n",
    "prove a mathematical statement, A, you assume temporarily that A is\n",
    "false. If that assumption leads to a contradiction, you conclude that A\n",
    "must actually be true.\n",
    "\n",
    "Similarly, to test a hypothesis like, \"This effect is real,\" we assume,\n",
    "temporarily, that it is not. That's the null hypothesis. Based on that\n",
    "assumption, we compute the probability of the apparent effect. That's\n",
    "the p-value. If the p-value is low, we conclude that the null hypothesis\n",
    "is unlikely to be true."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8bc51b88",
   "metadata": {},
   "source": [
    "## HypothesisTest\n",
    "\n",
    "`thinkstats2` provides `HypothesisTest`, a class that represents the\n",
    "structure of a classical hypothesis test. Here is the definition:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "b2b0cc98",
   "metadata": {},
   "outputs": [],
   "source": [
    "class HypothesisTest(object):\n",
    "\n",
    "    def __init__(self, data):\n",
    "        self.data = data\n",
    "        self.MakeModel()\n",
    "        self.actual = self.TestStatistic(data)\n",
    "\n",
    "    def PValue(self, iters=1000):\n",
    "        self.test_stats = [self.TestStatistic(self.RunModel()) \n",
    "                           for _ in range(iters)]\n",
    "\n",
    "        count = sum(1 for x in self.test_stats if x >= self.actual)\n",
    "        return count / iters\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        raise UnimplementedMethodException()\n",
    "\n",
    "    def MakeModel(self):\n",
    "        pass\n",
    "\n",
    "    def RunModel(self):\n",
    "        raise UnimplementedMethodException()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff3a1e1a",
   "metadata": {},
   "source": [
    "`HypothesisTest` is an abstract parent class that provides complete\n",
    "definitions for some methods and place-keepers for others. Child classes\n",
    "based on `HypothesisTest` inherit `__init__` and `PValue` and provide\n",
    "`TestStatistic`, `RunModel`, and optionally `MakeModel`.\n",
    "\n",
    "`__init__` takes the data in whatever form is appropriate. It calls\n",
    "`MakeModel`, which builds a representation of the null hypothesis, then\n",
    "passes the data to `TestStatistic`, which computes the size of the\n",
    "effect in the sample.\n",
    "\n",
    "`PValue` computes the probability of the apparent effect under the null\n",
    "hypothesis. It takes as a parameter `iters`, which is the number of\n",
    "simulations to run. The first line generates simulated data, computes\n",
    "test statistics, and stores them in `test_stats`. The result is the\n",
    "fraction of elements in `test_stats` that exceed or equal the observed\n",
    "test statistic, `self.actual`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a4e2164a",
   "metadata": {},
   "source": [
    "As a simple example[^2], suppose we toss a coin 250 times and see 140\n",
    "heads and 110 tails. Based on this result, we might suspect that the\n",
    "coin is biased; that is, more likely to land heads. To test this\n",
    "hypothesis, we compute the probability of seeing such a difference if\n",
    "the coin is actually fair:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "3a8cc565",
   "metadata": {},
   "outputs": [],
   "source": [
    "class CoinTest(thinkstats2.HypothesisTest):\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        heads, tails = data\n",
    "        test_stat = abs(heads - tails)\n",
    "        return test_stat\n",
    "\n",
    "    def RunModel(self):\n",
    "        heads, tails = self.data\n",
    "        n = heads + tails\n",
    "        sample = [random.choice('HT') for _ in range(n)]\n",
    "        hist = thinkstats2.Hist(sample)\n",
    "        data = hist['H'], hist['T']\n",
    "        return data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ca35d3f",
   "metadata": {},
   "source": [
    "The parameter, `data`, is a pair of integers: the number of heads and\n",
    "tails. The test statistic is the absolute difference between them, so\n",
    "`self.actual` is 30.\n",
    "\n",
    "`RunModel` simulates coin tosses assuming that the coin is actually\n",
    "fair. It generates a sample of 250 tosses, uses Hist to count the number\n",
    "of heads and tails, and returns a pair of integers.\n",
    "\n",
    "Now all we have to do is instantiate `CoinTest` and call `PValue`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "2b20d326",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.061"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ct = CoinTest((140, 110))\n",
    "p_value = ct.PValue()\n",
    "p_value"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "97387824",
   "metadata": {},
   "source": [
    "The result is about 0.07, which means that if the coin is fair, we\n",
    "expect to see a difference as big as 30 about 7% of the time.\n",
    "\n",
    "How should we interpret this result? By convention, 5% is the threshold\n",
    "of statistical significance. If the p-value is less than 5%, the effect\n",
    "is considered significant; otherwise it is not.\n",
    "\n",
    "But the choice of 5% is arbitrary, and (as we will see later) the\n",
    "p-value depends on the choice of the test statistics and the model of\n",
    "the null hypothesis. So p-values should not be considered precise\n",
    "measurements."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f702fc14",
   "metadata": {},
   "source": [
    "I recommend interpreting p-values according to their order of magnitude:\n",
    "if the p-value is less than 1%, the effect is unlikely to be due to\n",
    "chance; if it is greater than 10%, the effect can plausibly be explained\n",
    "by chance. P-values between 1% and 10% should be considered borderline.\n",
    "So in this example I conclude that the data do not provide strong\n",
    "evidence that the coin is biased or not."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0cd3e27",
   "metadata": {},
   "source": [
    "## Testing a difference in means\n",
    "\n",
    "One of the most common effects to test is a difference in mean between\n",
    "two groups. In the NSFG data, we saw that the mean pregnancy length for\n",
    "first babies is slightly longer, and the mean birth weight is slightly\n",
    "smaller. Now we will see if those effects are statistically significant.\n",
    "\n",
    "For these examples, the null hypothesis is that the distributions for\n",
    "the two groups are the same. One way to model the null hypothesis is by\n",
    "**permutation**; that is, we can take values for first babies and others\n",
    "and shuffle them, treating the two groups as one big group:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "fa95626f",
   "metadata": {},
   "outputs": [],
   "source": [
    "class DiffMeansPermute(thinkstats2.HypothesisTest):\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        group1, group2 = data\n",
    "        test_stat = abs(group1.mean() - group2.mean())\n",
    "        return test_stat\n",
    "\n",
    "    def MakeModel(self):\n",
    "        group1, group2 = self.data\n",
    "        self.n, self.m = len(group1), len(group2)\n",
    "        self.pool = np.hstack((group1, group2))\n",
    "\n",
    "    def RunModel(self):\n",
    "        np.random.shuffle(self.pool)\n",
    "        data = self.pool[:self.n], self.pool[self.n:]\n",
    "        return data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78c725f1",
   "metadata": {},
   "source": [
    "`data` is a pair of sequences, one for each group. The test statistic is\n",
    "the absolute difference in the means.\n",
    "\n",
    "`MakeModel` records the sizes of the groups, `n` and `m`, and combines\n",
    "the groups into one NumPy array, `self.pool`.\n",
    "\n",
    "`RunModel` simulates the null hypothesis by shuffling the pooled values\n",
    "and splitting them into two groups with sizes `n` and `m`. As always,\n",
    "the return value from `RunModel` has the same format as the observed\n",
    "data.\n",
    "\n",
    "To test the difference in pregnancy length, we run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "aca382ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "download(\"https://github.com/AllenDowney/ThinkStats2/raw/master/code/nsfg.py\")\n",
    "download(\"https://github.com/AllenDowney/ThinkStats2/raw/master/code/first.py\")\n",
    "download(\"https://github.com/AllenDowney/ThinkStats2/raw/master/code/2002FemPreg.dct\")\n",
    "download(\n",
    "    \"https://github.com/AllenDowney/ThinkStats2/raw/master/code/2002FemPreg.dat.gz\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "29e7d525",
   "metadata": {},
   "outputs": [],
   "source": [
    "import first\n",
    "\n",
    "live, firsts, others = first.MakeFrames()\n",
    "data = firsts.prglngth.values, others.prglngth.values\n",
    "ht = DiffMeansPermute(data)\n",
    "pvalue = ht.PValue()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "707fcc69",
   "metadata": {},
   "source": [
    "`MakeFrames` reads the NSFG data and returns DataFrames representing all\n",
    "live births, first babies, and others. We extract pregnancy lengths as\n",
    "NumPy arrays, pass them as data to `DiffMeansPermute`, and compute the\n",
    "p-value. The result is about 0.17, which means that we expect to see a\n",
    "difference as big as the observed effect about 17% of the time. So this\n",
    "effect is not statistically significant."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53e27b27",
   "metadata": {},
   "source": [
    "`HypothesisTest` provides `PlotCdf`, which plots the distribution of the\n",
    "test statistic and a gray line indicating the observed effect size:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "9927271d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAkgAAAGwCAYAAABSN5pGAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy80BEi2AAAACXBIWXMAAA9hAAAPYQGoP6dpAAA57ElEQVR4nO3deXxU1f3/8XcSsoAhCYgkgIFEZbNsyhKDCiKpLC5Q9SsFa4AqbtBiUypQkbj0S/i6IFVQlIrUn5VFv4r9IaXVAAElrIJCxQRoEKokQCkJYUnizPn94Y+pk8xknZk7y+v5eMyjcu6dmc/phdx3zj333DBjjBEAAAAcwq0uAAAAwN8QkAAAAKohIAEAAFRDQAIAAKiGgAQAAFANAQkAAKAaAhIAAEA1zawuwNfsdru+/fZbtWzZUmFhYVaXAwAA6sEYo9OnT6t9+/YKD/f++E7IBaRvv/1WycnJVpcBAAAa4ciRI7r00ku9/j0hF5Batmwp6fv/g+Pi4iyuBgAA1EdZWZmSk5Md53FvC7mAdOGyWlxcHAEJAIAA46vpMUzSBgAAqIaABAAAUA0BCQAAoBoCEgAAQDUEJAAAgGoISAAAANUQkAAAAKohIAEAAFRDQAIAAKiGgAQAAFCNpQFp48aNuvXWW9W+fXuFhYVp1apVdb5nw4YNuvrqqxUdHa0rrrhCS5cu9XqdAAAgtFj6LLYzZ86od+/e+vnPf67bb7+9zv2Liop0880368EHH9Sf/vQn5ebm6r777lO7du00bNgwH1QMhBZjjOx2u1NbeHi4z56FBIQam82u8rMVVpfhMXGxMQH788LSgDRixAiNGDGi3vsvWrRIqampev755yVJ3bt31yeffKIXXniBgAR4gd1uV1FRkVNbamqqIiIiLKoIcC0YgsXGHfu1dNVmq8vwqCW/G6/4ls2tLqNRLA1IDZWfn6+MjAyntmHDhumRRx5x+56KigpVVPznH01ZWZm3ygMANEJTw00wBgtYL6ACUnFxsRITE53aEhMTVVZWpnPnzql585opNScnR08++aSvSgQAuOEqCBFu4K8CKiA1xsyZM5WVleX4c1lZmZKTky2sCACCQ0NGfghCCDQBFZCSkpJUUlLi1FZSUqK4uDiXo0eSFB0drejoaF+UBwB+z1NzdQg83jVh9EAN6tfZ6jKaLC42xuoSGi2gAlJ6errWrFnj1PbRRx8pPT3doooAwL/9MBCFQqgJhmAR2yJaEREsU2g1SwNSeXm5Dhw44PhzUVGRdu/erdatW6tjx46aOXOmvvnmG7355puSpAcffFALFizQo48+qp///Odat26dVq5cqQ8//NCqLgCA36g+OhRogaip4YZgAU+yNCDt2LFDQ4YMcfz5wlyh8ePHa+nSpTp69KgOHz7s2J6amqoPP/xQv/rVr/T73/9el156qf7whz9wiz+AkHUhFAVSGHIVhAg38DdhxhhjdRG+VFZWpvj4eJWWliouLs7qcgC/ZrPZWAfJz/jjJbOGjPwQhNBYvj5/B9QcJAAIFo2ZLO3JQOSpuToEHgQrAhIA+IiVl8N+GIgINUDdCEgA4GU2m11rNu71WSiqPjpEIAIajoAEAF7iy2B0IRQRhgDPICABgBes31qgBW+v99rnc8kM8C4CEgA0UfUJ1xu2F+rND/Lr9d7GTJYmEAHeR0ACgEZq7CU0LocB/o+ABAC1cHc7fmPuRJsweqBGDupBKAICAAEJAOQ6CHnqdvzMUem6ZXBPghEQQAhIAEKWL9YlmjJuiIakdfXKZwPwHgISgJDkjbvMWH8ICB4EJAAhxWaza3XennrfZVYfzC0Cgg8BCUDIaMqokbvb8RklAoITAQlA0PrhxOv6rk3kKggRgoDQQ0ACEFQaM/E6c1S6bujfhSAEwIGABCAoNHbRRu4yA+AKAQlAwGvM3CLWJgJQGwISgIDV2DvSGDUCUBcCEoCA1JBRox9OvGaeEYD6ICABCBgXJmDX5440HggLoCkISAD8XkMmYDO3CIAnEJAA+LW87YV6deUmVVRW1bkvc4sAeAoBCYBfstnsKi0/pxffWlfnvowaAfA0AhIAv9LQ9YwYNQLgDQQkAH6jvnemMQEbgLcRkABYqiF3pknS8ucmKTIywgeVAQhlBCQAlmjopbQWMVG6787rCEcAfIKABMDnGrPII5fTAPgSAQmAzzTk0SDcmQbASgQkAF51YY7Rxh37uTMNQMAgIAHwGi6lAQhUBCQAXpG7ZZ9eXpZX535cSgPgjwhIADyuvuGIS2kA/BUBCYBH1SccTRg9UCMH9WDUCIDfIiAB8Ii67lDLHJWuG/p3YY4RgIBAQALQZHnbC/Xqyk2qqKxyuf3hsYM19JruPq4KABqPgASgSaqqbHrxrXVutxOOAAQiAhKARsvbXkg4AhCUCEgAGqWukSPuUAMQyAhIABqsrgUglz83iYfKAghoBCQADbJ+W4FeWb7R7fZf/uxGwhGAgEdAAlBvVd/Z9PKyPIWFhbnczsgRgGBBQAJQJ5vNrrzPivRe7l61aNGixvboqEg9cNf1hCMAQYOABKAGm82u8rMVstltWrf9oN7L3et239szrtJPR/Zn8UcAQYWABMDBZrNrzca9WrpqsyTJGKOzZ8+63T86KpJwBCAoEZAASKr7zrTqLlxWIxwBCEYEJAANDkc8bBZAsCMgASGuqspW73B0+9AemnDHjYqKivRyVQBgLQISEIIuTMLeuGO/Y76RK+NHpatjm+/DUIuYSEVEhDNqBCAkEJCAEFOfy2kX7kyTjIqKinxTGAD4EX4VBEJI7pZ9dYYj7kwDAAISEDJyt+zTy8vyat2HO9MA4HtcYgNCQH3CEXemAcB/EJCAIFdbOMocla4b+ndRbItoghEA/AABCQhitYWjh8cO1tBruvu4IgAIDPzKCASp9VsLCEcA0EgEJCAI2Wx2t3erEY4AoG6WB6SFCxcqJSVFMTExSktL07Zt22rdf/78+eratauaN2+u5ORk/epXv9L58+d9VC0QGFbn7XHZTjgCgPqxNCCtWLFCWVlZys7O1meffabevXtr2LBhOnbsmMv93377bc2YMUPZ2dnat2+fXn/9da1YsUK//e1vfVw54L9yt+zTmx/k12jPHJVOOAKAerI0IM2bN0+TJk3SxIkTdeWVV2rRokVq0aKFlixZ4nL/zZs369prr9W4ceOUkpKim266SWPHjq111KmiokJlZWVOLyBY1Tbv6JbBPX1cDQAELssCUmVlpXbu3KmMjIz/FBMeroyMDOXn1/ztV5IGDhyonTt3OgLRP/7xD61Zs0YjR450+z05OTmKj493vJKTkz3bEcBP1DbvaMq4IdzGDwANYNlt/idOnJDNZlNiYqJTe2Jior766iuX7xk3bpxOnDih6667TsYYfffdd3rwwQdrvcQ2c+ZMZWVlOf5cVlZGSELQsdnsWr5mu8ttD48drCFpXX1cEQAEtoBaB2nDhg2aM2eOXn75ZaWlpenAgQOaOnWqnn76aT3++OMu3xMdHa3o6GgfVwr4Tt72Qr26cpMqKqtqbGPeEQA0jmUBqU2bNoqIiFBJSYlTe0lJiZKSkly+5/HHH9c999yj++67T5LUs2dPnTlzRvfff78ee+wxhYdzCQGhxWazuw1HEvOOAKCxLEsUUVFR6tu3r3Jzcx1tdrtdubm5Sk9Pd/mes2fP1ghBERERkiRjjPeKBfzUmo173YYj5h0BQONZeoktKytL48ePV79+/TRgwADNnz9fZ86c0cSJEyVJmZmZ6tChg3JyciRJt956q+bNm6errrrKcYnt8ccf16233uoISkCosNnsWrpqs8ttU8YNYd4RADSBpQFpzJgxOn78uGbPnq3i4mL16dNHa9eudUzcPnz4sNOI0axZsxQWFqZZs2bpm2++0SWXXKJbb71V//3f/21VFwDLuFsMcvlzkxQZyS8MANAUYSbErk2VlZUpPj5epaWliouLs7ocoFHcPYR2wuiBunVIL499j81mU1FRkVNbamoqI7YAfM7X528mKAABprbFIEcO6uHjagAgOBGQgABSVWVjMUgA8IGAWgcJCGV52wv14lvrXG5jMUgA8Cx+3QQCwIX1jlxhMUgA8DwCEuDnLjxGxNV6R9FRkSwGCQBewCU2wI/V9hgRSXrgruuZdwQAXkBAAvxUXY8RYb0jAPAefvUE/FRdjxEhHAGA9xCQAD/EY0QAwFpcYgP8UPnZCpftXFYDAN9gBAnwQxu2F9ZomzB6IOEIAHyEgAT4mfVbC/TmB/k12gf162xBNQAQmghIgB+x2exuHyUS2yLax9UAQOgiIAF+ZHXeHpftPGcNAHyLn7iAn8jdss/lpbXMUenctQYAPkZAAvxA7pZ9enlZnsttPEoEAHyPgARYrLZwxKU1ALAG6yABFrDZ7Co/W6EN2wtdXlaTpIfHDubSGgBYhIAE+Fje9kL94d1PdPZ8pdt9Hh47WEOv6e7DqgAAP0RAAnyorgfQSoQjAPAHTG4AfMRms2v5mu2EIwAIAIwgAT6Qt72wzpEjHkILAP6DgAR4WVWVTS++tc7t9sVP3aP42ObcrQYAfoSABHhR3vbCWsPRlHFD1Dr+Ih9WBACoD35lBbzkwoRsd7ikBgD+ixEkwEtW5+1xO+do+XOTFBkZ4eOKAAD1xQgS4AXrtxa4XQDylz+7kXAEAH6OESTAw2w2uxa8vd7lNkaOACAwMIIEeNjqvD0u26eMG0I4AoAAQUACPMjdpbXMUelMyAaAAEJAAjyktktrtwzu6eNqAABNQUACPGTNxr0u26eMG8IikAAQYPipDXiAzWbX0lWba7RzaQ0AAhMBCfAAd6NHXFoDgMBEQAKayN3o0YTRA7m0BgABip/eQBO5u61/5KAePq4EAOApBCSgCXK37HN5Wz+jRwAQ2PgJDjTS+q0FenlZnsttjB4BQGAjIAGNUNuaR9zWDwCBj5/iQCOUlp9z2f7w2MHc1g8AQYCH1QINtH5rgcvRo8xR6Rp6TXcLKgIAeBojSEADuAtHknRD/y4+rgYA4C0EJKCeapt31CImSrEton1cEQDAWwhIQD25Wy07OipS9915HROzASCIMAcJqAd3q2XfnnGVfjqyP+EIAIIMP9WBenA3ekQ4AoDgxE92oA48aw0AQg8/3YE68Kw1AAg9BCSgFuu3FvCsNQAIQfyEB9yo7bZ+Ro8AILgRkAA33F1a41lrABD8+CkPuJC7ZZ/LS2uZo9J51hoAhADWQQJ+wGaza3XeHpfhSJJuGdzTxxUBAKxAQAL+v7zthXp15SZVVFa53M6lNQAIHfy0B/T9yFFt4ejhsYO5tAYAIYSABOj7Cdm1haOh13T3cUUAACtZHpAWLlyolJQUxcTEKC0tTdu2bat1/1OnTmny5Mlq166doqOj1aVLF61Zs8ZH1SIYuVvrSPr+shrhCABCj6VzkFasWKGsrCwtWrRIaWlpmj9/voYNG6aCggK1bdu2xv6VlZX68Y9/rLZt2+rdd99Vhw4d9PXXXyshIcH3xSMo1LbW0fLnJikyMsLHFQEA/IGlAWnevHmaNGmSJk6cKElatGiRPvzwQy1ZskQzZsyosf+SJUt08uRJbd68WZGRkZKklJSUWr+joqJCFRUVjj+XlZV5rgMIeLWtdUQ4AoDQZdkltsrKSu3cuVMZGRn/KSY8XBkZGcrPd325489//rPS09M1efJkJSYmqkePHpozZ45sNpvb78nJyVF8fLzjlZyc7PG+IDC5u7TGWkcAAMsC0okTJ2Sz2ZSYmOjUnpiYqOLiYpfv+cc//qF3331XNptNa9as0eOPP67nn39ev/vd79x+z8yZM1VaWup4HTlyxKP9QGCq7dIaax0BAAJqHSS73a62bdvqtddeU0REhPr27atvvvlGzz77rLKzs12+Jzo6WtHR0T6uFP5uzca9LttZ6wgAIFkYkNq0aaOIiAiVlJQ4tZeUlCgpKcnle9q1a6fIyEhFRPxnbkj37t1VXFysyspKRUVFebVmBAebza6lqzbXaOfSGgDgAst+VY6KilLfvn2Vm5vraLPb7crNzVV6errL91x77bU6cOCA7Ha7o62wsFDt2rUjHKHeSsvPuWzn0hoA4AJLryVkZWVp8eLF+uMf/6h9+/bpoYce0pkzZxx3tWVmZmrmzJmO/R966CGdPHlSU6dOVWFhoT788EPNmTNHkydPtqoLCDB52ws1afb/qdE+YfRALq0BABwsnYM0ZswYHT9+XLNnz1ZxcbH69OmjtWvXOiZuHz58WOHh/zlpJScn669//at+9atfqVevXurQoYOmTp2q6dOnW9UFBJALjxNxZVC/zj6uBgDgz8KMMcbqInyprKxM8fHxKi0tVVxcnNXlwIf+7/ovXM49ahETpaVzJjCC5ILNZlNRUZFTW2pqqtM8QADwBV+fvzkjICS4m5gtSffdeR3hCADgJKBu8wcay91t/TxOBADgCr82I+hVVdlcjh5NGD2QcAQAcIkRJAS1vO2FevGtdS63jRzUw8fVAAACBSNICFpVVTa34Yjb+gEAtWEECUGptpGj6KhIRo8AALXiV2gEHZvNrj+8+4nb7Q/cdT2jRwCAWjGChKBTfrZCZ89XutzGXWsAgPrg12gEnY079rts/+XPbiQcAQDqhREkBBV3C0IufuoetY6/yIKKAACBiBEkBBV3C0LGxzb3cSUAgEBGQELQcDd6xC39AICG4qyBoFF+tsJlO7f0AwAaioCEoMboEQCgMThzIGhs2F5Yo21Qv84WVAIACHQNCkiZmZk6ffq048+ff/65qqqqPF4U0FC5W/bpzQ/yrS4DABAkGhSQ/vSnP+ncuXOOP19//fU6cuSIx4sCGiJ3yz69vCzP5bbYFtE+rgYAEAwaFJCMMbX+GfC12sLRlHFDmH8EAGgUzh4IWLWFo4fHDtaQtK4+rggAECwavJL2l19+qeLiYknfjyB99dVXKi8vd9qnV69enqkOcKOucDT0mu4+rggAEEwaHJCGDh3qdGntlltukSSFhYXJGKOwsDDZbDbPVQhUs35rAeEIAOBVDQpIRUVF3qoDqBebza4Fb693uY1wBADwlAYFpE6dOnmrDqBeVuftcdlOOAIAeFKDL7FJ0v79+/XBBx/o0KFDCgsLU2pqqkaPHq3LLrvM0/UBDu7WOsoclU44AgB4VIMDUk5OjmbPni273a62bdvKGKPjx49rxowZmjNnjqZNm+aNOhHiapuUfcvgnj6uBgAQ7Bp0m//69es1a9YsPfbYYzpx4oSOHj2q4uJiR0CaMWOGNm7c6K1aEaJqm5TNWkcAAG8IMw1Y7XHMmDFKSEjQq6++6nL7/fffr9OnT2vZsmUeK9DTysrKFB8fr9LSUsXFxVldDupgs9l1V9ZrLrcx78j7bDZbjZszUlNTFRERYVFFAEKVr8/fDfrVe9u2bbrnnnvcbr/nnnu0ZcuWJhcFXMCkbACAFRoUkEpKSpSSkuJ2e2pqqmMRSaCpmJQNALBKgwLS+fPnFRUV5XZ7ZGSkKisrm1wUwKRsAICVGnwX2x/+8AfFxsa63Hb69OkmFwQwKRsAYLUGBaSOHTtq8eLFde4DNFZdK2XzAFoAgC80KCAdOnTIS2UA3ys/W+GynUnZAABfatC1inXr1unKK69UWVlZjW2lpaX60Y9+pE2bNnmsOISejTv212hjUjYAwNcaFJDmz5+vSZMmuVx/ID4+Xg888IDmzZvnseIQWmw2u5au2lyj/Yb+XSyoBgAQyhoUkD7//HMNHz7c7fabbrpJO3fubHJRCE1rNu512R7bItrHlQAAQl2D10GKjIx0u71Zs2Y6fvx4k4tC6HE3ejRh9EDuWgMA+FyDzjwdOnTQ3r2uf8uXpC+++ELt2rVrclEIPe4mZ48c1MPHlQAA0MCANHLkSD3++OM6f/58jW3nzp1Tdna2brnlFo8Vh9DhanI2o0cAAKs06Db/WbNm6b333lOXLl00ZcoUde36/Zo0X331lRYuXCibzabHHnvMK4UieLm7vDaoX2cLqgEAoIEBKTExUZs3b9ZDDz2kmTNnyhgjSQoLC9OwYcO0cOFCJSYmeqVQBC93D6RlcjYAwCoNftRIp06dtGbNGv373//WgQMHZIxR586d1apVK2/UhyDn7oG0XF4DAFipwQHpglatWql///6erAUhprYH0jI5GwBgJX5FhyV4IC0AwJ9xFoLP8UBaAIC/IyDB53ggLQDA3xGQ4Bd4IC0AwJ8QkOAXeCAtAMCfEJDgc65WzQYAwJ8QkOBT7lbNBgDAnxCQ4FPuJmizajYAwJ8QkOBTPJQWABAIOCvBZ3goLQAgUBCQ4DNcXgMABAoCEnyGy2sAgEDBmQk+weU1AEAg8YuAtHDhQqWkpCgmJkZpaWnatm1bvd63fPlyhYWFafTo0d4tEE22Om+Py3YurwEA/JHlAWnFihXKyspSdna2PvvsM/Xu3VvDhg3TsWPHan3foUOHNG3aNF1//fU+qhSNlbtln978IL9GO5fXAAD+yvKz07x58zRp0iRNnDhRV155pRYtWqQWLVpoyZIlbt9js9l0991368knn9Rll13mw2rRUOu3FujlZXkut40c1MPH1QAAUD+WBqTKykrt3LlTGRkZjrbw8HBlZGQoP7/miMMFTz31lNq2bat77723zu+oqKhQWVmZ0wu+YbPZteDt9S63TRk3hNEjAIDfsvQMdeLECdlsNiUmJjq1JyYmqri42OV7PvnkE73++utavHhxvb4jJydH8fHxjldycnKT60b9rNm412X7w2MHa0haVx9XAwBA/QXUr/CnT5/WPffco8WLF6tNmzb1es/MmTNVWlrqeB05csTLVUJyf9da5qh0Db2muwUVAQBQf82s/PI2bdooIiJCJSUlTu0lJSVKSkqqsf/Bgwd16NAh3XrrrY42u90uSWrWrJkKCgp0+eWXO70nOjpa0dHcKeVr7u5au2VwTx9XAgBAw1k6ghQVFaW+ffsqNzfX0Wa325Wbm6v09PQa+3fr1k179uzR7t27Ha/bbrtNQ4YM0e7du7l85ifWby3grjUAQECzdARJkrKysjR+/Hj169dPAwYM0Pz583XmzBlNnDhRkpSZmakOHTooJydHMTEx6tHD+c6nhIQESarRDmvUNjGbu9YAAIHC8oA0ZswYHT9+XLNnz1ZxcbH69OmjtWvXOiZuHz58WOHhjDoECncTs7lrDQAQSMKMMcbqInyprKxM8fHxKi0tVVxcnNXlBBWbza67sl6r0Z45Kl2jbuxtQUVoKpvNpqKiIqe21NRURUREWFQRgFDl6/M3v9LDY8rPVrhsZ2I2ACDQEJDgMRt37K/RxsRsAEAg4swFj3C37tGgfp0tqAYAgKYhIMEj3K17FNuCNagAAIGHgIQmY90jAECw4eyFJmHdIwBAMCIgoUnc3bnGukcAgEDGGQyNZrPZtXrDFzXaM0ela0haVwsqAgDAMyxfSRuBKW97oV5duUkVlVU1tt3Qv4sFFQEA4DmMIKHBbDa723AkcecaACDwEZDQYGs27nUbjph7BAAIBpzJ0CDuFoSUvg9HzD0CAAQD5iChQdZs3OuyfflzkxQZyQNMAQDBgREk1Ju70aMJowcSjgAAQYWAhHpzt+YRC0ICAIINAQlNwuNEAADBiDMbmmRQv85WlwAAgMcRkFBvG3fst7oEAAB8goCEeqnt9n4AAIINAQn14u72flbNBgAEIwIS6lTb7f1M0AYABCPObqiTu9Ejbu8HAAQrAhJqxegRACAUcYZDrRg9AgCEIgIS3GL0CAAQqjjLwa3VeXtctjN6BAAIdgQkuJS7ZZ/e/CC/RjujRwCAUMCZDjXkbtmnl5fludzG6BEAIBQQkOBk/dYCt+FoyrghjB4BAEICZzs42Gx2LXh7vcttD48drCFpXX1cEQAA1iAgwaH8bIXL9ofHDtbQa7r7uBoAAKxDQILDhu2FNdoyR6UTjgAAIYeABEnu71q7oX8XC6oBAMBaBCTUetdabItoH1cDAID1CEghrrZwxF1rAIBQxdkvhNUWjrhrDQAQyghIIaqucMTEbABAKCMghSDCEQAAtSMghZjaVsomHAEA8D0CUgipa6VswhEAAN8jIIUQVsoGAKB+CEghjpWyAQCoiYAUQjbu2F+jjZWyAQCoiYAUImw2u5au2mx1GQAABAQCUohYs3Gvy3YeJQIAQE0EpBDgbvRowuiBPEoEAAAXODuGAHd3r40c1MPHlQAAEBgISCHAZrfXaGP0CAAA95pZXQC8K297oV58a12N9kH9OltQDQAAgYEhhCBms9n16spNVpcBAEDAISAFsTUb96qisqpGe4uYKO5eAwCgFgSkIFXbukf33Xkd848AAKgFc5CClLt1j5Y/N0mRkRE+rgYAgMDCMEIQqm3dI8IRAAB1IyAFIdY9AgCgaQhIIYJ1jwAAqD+/OGMuXLhQKSkpiomJUVpamrZt2+Z238WLF+v6669Xq1at1KpVK2VkZNS6fyjasL2wRhvrHgEAUH+WB6QVK1YoKytL2dnZ+uyzz9S7d28NGzZMx44dc7n/hg0bNHbsWK1fv175+flKTk7WTTfdpG+++cbHlfun9VsL9OYH+VaXAQBAQAszxhgrC0hLS1P//v21YMECSZLdbldycrJ+8YtfaMaMGXW+32azqVWrVlqwYIEyMzPr3L+srEzx8fEqLS1VXFxck+v3JzabXXdlveZy28p593OJDQ1ms9lUVFTk1JaamqqICCb7A/AtX5+/LT1jVlZWaufOncrIyHC0hYeHKyMjQ/n59RsFOXv2rKqqqtS6dWuX2ysqKlRWVub0ClbuJmdPGTeEcAQAQANYetY8ceKEbDabEhMTndoTExNVXFxcr8+YPn262rdv7xSyfignJ0fx8fGOV3JycpPr9lcbd+yv0ZY5Kl1D0rpaUA0AAIEroIcV5s6dq+XLl+v9999XTEyMy31mzpyp0tJSx+vIkSM+rtI33K19dEP/LhZUAwBAYLN0Je02bdooIiJCJSUlTu0lJSVKSkqq9b3PPfec5s6dq48//li9evVyu190dLSio4P/uWPuVs7mmWsAADScpSNIUVFR6tu3r3Jzcx1tdrtdubm5Sk9Pd/u+Z555Rk8//bTWrl2rfv36+aJUv1bbytnMPQIAoOEsfxZbVlaWxo8fr379+mnAgAGaP3++zpw5o4kTJ0qSMjMz1aFDB+Xk5EiS/ud//kezZ8/W22+/rZSUFMdcpdjYWMXGxlrWDyu5Gz1i5WwAABrH8oA0ZswYHT9+XLNnz1ZxcbH69OmjtWvXOiZuHz58WOHh/xkFeeWVV1RZWak777zT6XOys7P1xBNP+LJ0v8DoEQAAnmd5QJKkKVOmaMqUKS63bdiwwenPhw4d8n5BAYTnrgEA4HkMMQQ4V48VYfQIAICm4SwawNw9VoTnrgEA0DQEpABls9m14O31Lrdxaz8AAE1DQApQq/P2uGznsSIAADQdZ9IA5O7SGo8VAQDAMwhIAaa2S2u3DO7p42oAAAhOBKQA4+62fi6tAQDgOZxRA8zGHftrtHFpDQAAzyIgBRB3q2bf0L+LBdUAABC8CEgBpLT8nMt2busHAMCz/OJRI6hb3vZCvfjWuhrtrJoNAIDncWYNADabXa+u3ORyG6tmAwDgeQSkALA6b48qKqtqtLeIieLyGgAAXkBA8nPuFoWUpPvuvI7LawAAeAFzkPxYbYtCLn9ukiIjI3xcEQAAoYHhBz9W26KQhCMAALyHgOTHNmwvrNHGopAAAHgfAclPuZt7xKKQAAB4HwHJD9U294i71gAA8D4Ckh/igbQAAFiLs60f4oG0AABYi4DkZ3ggLQAA1iMg+Zk1G/e6bGfuEQAAvkNA8iPuRo94IC0AAL7FWdePuBs9Gjmoh48rAQAgtBGQ/ASjRwAA+A/OvH7C3a39jB4BAOB7BCQ/4erWfkaPAACwBmdfP+Du8tqgfp0tqAYAABCQ/AC39gMA4F8ISBZjcjYAAP6HM7DFuLUfAAD/Q0CyEKNHAAD4J87CFiotP+eyndEjAACs1czqAkJV3vZCvfjWuhrtjB4BAGA9zsQWsNnsenXlJpfbuLUfAADrEZAssGbjXlVUVtVobxETxa39AAD4AQKSj7mbmC1J9915HZfXAADwA8xB8jF3t/Uvf26SIiMjfFwNAABwheEKH6rttn7CEQAA/oOA5EMsCgkAQGAgIPkIi0ICABA4ODP7SPnZCpftjB4BAOB/CEgWYvQIAAD/xNnZRzZsL6zRxqKQAAD4JwKSD6zfWqA3P8i3ugwAAFBPBCQvs9nsWvD2epfbWDUbAAD/REDyMneTs6eMG8L8IwAA/BRnaC9zNfcoc1S6hqR1taAaAABQHwQkL8rdss/l3KMb+nexoBoAAFBfBCQvWb+1QC8vy3O5jblHAAD4NwKSF9Q2MZu5RwAA+D/O1F6wOm+Py/aHxw5m7hEAAAGAgORh7tY8yhyVrqHXdLegIgAA0FAEJA+q7dLaLYN7+rgaAADQWAQkD2LNIwAAgoNfnLUXLlyolJQUxcTEKC0tTdu2bat1/3feeUfdunVTTEyMevbsqTVr1vio0tqx5hEAAMHB8oC0YsUKZWVlKTs7W5999pl69+6tYcOG6dixYy7337x5s8aOHat7771Xu3bt0ujRozV69Gjt3bvXx5U7czf3iDWPAAAIPGHGGGNlAWlpaerfv78WLFggSbLb7UpOTtYvfvELzZgxo8b+Y8aM0ZkzZ7R69WpH2zXXXKM+ffpo0aJFdX5fWVmZ4uPjVVpaqri4OI/0wWaz666s11xuWznvfi6vIWDZbDYVFRU5taWmpioiIsKiigCEKm+cv2tj6Zm7srJSO3fuVEZGhqMtPDxcGRkZys+vORojSfn5+U77S9KwYcPc7l9RUaGysjKnl6cx9wgAgOBi6dn7xIkTstlsSkxMdGpPTExUcXGxy/cUFxc3aP+cnBzFx8c7XsnJyZ4pvg7MPQIAIHAF/fDGzJkzVVpa6ngdOXLEJ9/L3CMAAAJXMyu/vE2bNoqIiFBJSYlTe0lJiZKSkly+JykpqUH7R0dHKzrau88+i4uN0ZLfja/RBgS68PBwpaam1mgDgGBn6U+6qKgo9e3bV7m5uY42u92u3Nxcpaenu3xPenq60/6S9NFHH7nd3xfCwsIU37K50yssLMyyegBPCQsLU0REhNOLv9sAQoGlI0iSlJWVpfHjx6tfv34aMGCA5s+frzNnzmjixImSpMzMTHXo0EE5OTmSpKlTp2rw4MF6/vnndfPNN2v58uXasWOHXnvN9V1kAAAADWV5QBozZoyOHz+u2bNnq7i4WH369NHatWsdE7EPHz7sNKQ/cOBAvf3225o1a5Z++9vfqnPnzlq1apV69OhhVRcAAECQsXwdJF/z9ToKAACg6UJqHSQAAAB/REACAACohoAEAABQDQEJAACgGgISAABANQQkAACAaghIAAAA1RCQAAAAqiEgAQAAVGP5o0Z87cLC4WVlZRZXAgAA6uvCedtXDwAJuYB0+vRpSVJycrLFlQAAgIY6ffq04uPjvf49IfcsNrvdrm+//VYtW7ZUWFiYRz+7rKxMycnJOnLkSFA/5y1U+inR12BFX4MTfQ0+P+xny5Ytdfr0abVv397pIfbeEnIjSOHh4br00ku9+h1xcXFB/Rf2glDpp0RfgxV9DU70Nfhc6KcvRo4uYJI2AABANQQkAACAaghIHhQdHa3s7GxFR0dbXYpXhUo/JfoarOhrcKKvwcfKfobcJG0AAIC6MIIEAABQDQEJAACgGgISAABANQQkAACAaghIP7Bw4UKlpKQoJiZGaWlp2rZtW637v/POO+rWrZtiYmLUs2dPrVmzxmm7MUazZ89Wu3bt1Lx5c2VkZGj//v1O+5w8eVJ333234uLilJCQoHvvvVfl5eUe71t1nuxrVVWVpk+frp49e+qiiy5S+/btlZmZqW+//dbpM1JSUhQWFub0mjt3rlf6d4Gnj+mECRNq9GH48OFO+wTDMZVUo58XXs8++6xjHyuOqdSwvv7973/XHXfc4ah1/vz5jfrM8+fPa/Lkybr44osVGxurO+64QyUlJZ7sVqPq+qH69DUnJ0f9+/dXy5Yt1bZtW40ePVoFBQVO+9xwww01juuDDz7o6a7V4Om+PvHEEzX60a1bN6d9guW4uvq3GBYWpsmTJzv2CYTjunjxYl1//fVq1aqVWrVqpYyMjBr7++zcamCMMWb58uUmKirKLFmyxPz97383kyZNMgkJCaakpMTl/p9++qmJiIgwzzzzjPnyyy/NrFmzTGRkpNmzZ49jn7lz55r4+HizatUq8/nnn5vbbrvNpKammnPnzjn2GT58uOndu7fZsmWL2bRpk7niiivM2LFjA6qvp06dMhkZGWbFihXmq6++Mvn5+WbAgAGmb9++Tp/TqVMn89RTT5mjR486XuXl5QHTT2OMGT9+vBk+fLhTH06ePOn0OcFwTI0xTn08evSoWbJkiQkLCzMHDx507OPrY9qYvm7bts1MmzbNLFu2zCQlJZkXXnihUZ/54IMPmuTkZJObm2t27NhhrrnmGjNw4EBvdbPedf1Qffo6bNgw88Ybb5i9e/ea3bt3m5EjR5qOHTs6HbfBgwebSZMmOR3X0tJSb3XTGOOdvmZnZ5sf/ehHTv04fvy40z7BclyPHTvm1M+PPvrISDLr16937BMIx3XcuHFm4cKFZteuXWbfvn1mwoQJJj4+3vzzn/907OOrcysB6f8bMGCAmTx5suPPNpvNtG/f3uTk5Ljc/6677jI333yzU1taWpp54IEHjDHG2O12k5SUZJ599lnH9lOnTpno6GizbNkyY4wxX375pZFktm/f7tjnL3/5iwkLCzPffPONx/pWnaf76sq2bduMJPP111872jp16uTyH7a3eKOf48ePN6NGjXL7ncF8TEeNGmVuvPFGpzZfH1NjGt7XH3JXb12feerUKRMZGWneeecdxz779u0zkkx+fn4TelM7b/S1umPHjhlJJi8vz9E2ePBgM3Xq1MaU3Gje6Gt2drbp3bu32/cF83GdOnWqufzyy43dbne0BdpxNcaY7777zrRs2dL88Y9/NMb49tzKJTZJlZWV2rlzpzIyMhxt4eHhysjIUH5+vsv35OfnO+0vScOGDXPsX1RUpOLiYqd94uPjlZaW5tgnPz9fCQkJ6tevn2OfjIwMhYeHa+vWrR7r3w95o6+ulJaWKiwsTAkJCU7tc+fO1cUXX6yrrrpKzz77rL777rvGd6YW3uznhg0b1LZtW3Xt2lUPPfSQ/vWvfzl9RjAe05KSEn344Ye69957a2zz1TGVGtdXT3zmzp07VVVV5bRPt27d1LFjx0Z/ryfq8oTS0lJJUuvWrZ3a//SnP6lNmzbq0aOHZs6cqbNnz3rsO6vzZl/379+v9u3b67LLLtPdd9+tw4cPO7YF63GtrKzUW2+9pZ///Oc1HsoeaMf17Nmzqqqqcvz99OW5NeQeVuvKiRMnZLPZlJiY6NSemJior776yuV7iouLXe5fXFzs2H6hrbZ92rZt67S9WbNmat26tWMfT/NGX6s7f/68pk+frrFjxzo9RPGXv/ylrr76arVu3VqbN2/WzJkzdfToUc2bN6+JvarJW/0cPny4br/9dqWmpurgwYP67W9/qxEjRig/P18RERFBe0z/+Mc/qmXLlrr99tud2n15TKXG9dUTn1lcXKyoqKgagb+2/8+ayht9rc5ut+uRRx7Rtddeqx49ejjax40bp06dOql9+/b64osvNH36dBUUFOi9997zyPdW562+pqWlaenSperatauOHj2qJ598Utdff7327t2rli1bBu1xXbVqlU6dOqUJEyY4tQficZ0+fbrat2/vCES+PLcSkOBRVVVVuuuuu2SM0SuvvOK0LSsry/HfvXr1UlRUlB544AHl5OQEzHL5P/3pTx3/3bNnT/Xq1UuXX365NmzYoKFDh1pYmXctWbJEd999t2JiYpzag+GYhrLJkydr7969+uSTT5za77//fsd/9+zZU+3atdPQoUN18OBBXX755b4us9FGjBjh+O9evXopLS1NnTp10sqVK12OhgaL119/XSNGjFD79u2d2gPtuM6dO1fLly/Xhg0bavzs8QUusUlq06aNIiIiaty5UFJSoqSkJJfvSUpKqnX/C/9b1z7Hjh1z2v7dd9/p5MmTbr+3qbzR1wsuhKOvv/5aH330kdPokStpaWn67rvvdOjQoYZ3pA7e7OcPXXbZZWrTpo0OHDjg+IxgOqaStGnTJhUUFOi+++6rsxZvHlOpcX31xGcmJSWpsrJSp06d8tj3eqKuppgyZYpWr16t9evX69JLL61137S0NEly/D33NG/39YKEhAR16dLF6d9rsB3Xr7/+Wh9//HG9/71K/nlcn3vuOc2dO1d/+9vf1KtXL0e7L8+tBCRJUVFR6tu3r3Jzcx1tdrtdubm5Sk9Pd/me9PR0p/0l6aOPPnLsn5qaqqSkJKd9ysrKtHXrVsc+6enpOnXqlHbu3OnYZ926dbLb7Y6/uJ7mjb5K/wlH+/fv18cff6yLL764zlp2796t8PDwGkOhnuCtflb3z3/+U//617/Url07x2cEyzG94PXXX1ffvn3Vu3fvOmvx5jGVGtdXT3xm3759FRkZ6bRPQUGBDh8+3Ojv9URdjWGM0ZQpU/T+++9r3bp1Sk1NrfM9u3fvliTH33NP81ZfqysvL9fBgwcd/Qim43rBG2+8obZt2+rmm2+uc19/Pa7PPPOMnn76aa1du9ZpHpHk43NrvadzB7nly5eb6Ohos3TpUvPll1+a+++/3yQkJJji4mJjjDH33HOPmTFjhmP/Tz/91DRr1sw899xzZt++fSY7O9vlbf4JCQnmgw8+MF988YUZNWqUy1sRr7rqKrN161bzySefmM6dO/vklnBP9rWystLcdttt5tJLLzW7d+92uoW0oqLCGGPM5s2bzQsvvGB2795tDh48aN566y1zySWXmMzMzIDp5+nTp820adNMfn6+KSoqMh9//LG5+uqrTefOnc358+cdnxMMx/SC0tJS06JFC/PKK6/U+E4rjmlj+lpRUWF27dpldu3aZdq1a2emTZtmdu3aZfbv31/vzzTm+9vBO3bsaNatW2d27Nhh0tPTTXp6esD19aGHHjLx8fFmw4YNTv9Wz549a4wx5sCBA+app54yO3bsMEVFReaDDz4wl112mRk0aFDA9fXXv/612bBhgykqKjKffvqpycjIMG3atDHHjh1z7BMsx9WY7+8Q69ixo5k+fXqN7wyU4zp37lwTFRVl3n33Xae/n6dPn3baxxfnVgLSD7z00kumY8eOJioqygwYMMBs2bLFsW3w4MFm/PjxTvuvXLnSdOnSxURFRZkf/ehH5sMPP3TabrfbzeOPP24SExNNdHS0GTp0qCkoKHDa51//+pcZO3asiY2NNXFxcWbixIlOfxG8xZN9LSoqMpJcvi6swbFz506TlpZm4uPjTUxMjOnevbuZM2eOU7Dw936ePXvW3HTTTeaSSy4xkZGRplOnTmbSpElOJ1FjguOYXvDqq6+a5s2bm1OnTtXYZtUxNaZhfXX393Pw4MH1/kxjjDl37px5+OGHTatWrUyLFi3MT37yE3P06FFvdrPOuhrTV3f/Vt944w1jjDGHDx82gwYNMq1btzbR0dHmiiuuML/5zW+8vl6ON/o6ZswY065dOxMVFWU6dOhgxowZYw4cOOD0ncFyXI0x5q9//auRVOM8Y0zgHNdOnTq57Gt2drZjH1+dW8OMMab+400AAADBjzlIAAAA1RCQAAAAqiEgAQAAVENAAgAAqIaABAAAUA0BCQAAoBoCEgAAQDUEJAAAgGoISADwA2FhYVq1alWj3//EE0+oT58+HqsHgDUISADq5YYbbtAjjzzi0c+cMGGCRo8ebclnuAsyR48e1YgRI+r1Ga7C1LRp02o8CBhA4GlmdQEA4E+SkpKa9P7Y2FjFxsZ6qBoAlmnEc+cAhJjx48fXeHhkUVGRMcaYPXv2mOHDh5uLLrrItG3b1vzsZz8zx48fd7z3nXfeMT169DAxMTGmdevWZujQoaa8vNxkZ2e7fbhxdY35jEcffdR07tzZNG/e3KSmpppZs2aZyspKY4wxb7zxhtuHtUoy77//vjHm+6eoT5482SQlJZno6GjTsWNHM2fOHGNMzYdqdurUyRhjTHZ2tundu7dT/a+//rq58sorTVRUlElKSjKTJ09u8jEB4F2MIAGo0+9//3sVFhaqR48eeuqppyRJl1xyiU6dOqUbb7xR9913n1544QWdO3dO06dP11133aV169bp6NGjGjt2rJ555hn95Cc/0enTp7Vp0yYZYzRt2jTt27dPZWVleuONNyRJrVu3rvHdjf2Mli1baunSpWrfvr327NmjSZMmqWXLlnr00Uc1ZswY7d27V2vXrtXHH38sSYqPj6/x3S+++KL+/Oc/a+XKlerYsaOOHDmiI0eOSJK2b9+utm3b6o033tDw4cMVERHh8v+7V155RVlZWZo7d65GjBih0tJSffrpp008IgC8jYAEoE7x8fGKiopSixYtnC5BLViwQFdddZXmzJnjaFuyZImSk5NVWFio8vJyfffdd7r99tvVqVMnSVLPnj0d+zZv3lwVFRW1XtY6evRooz5j1qxZjv9OSUnRtGnTtHz5cj366KNq3ry5YmNj1axZs1q/+/Dhw+rcubOuu+46hYWFOb5f+j4gSlJCQkKtn/G73/1Ov/71rzV16lRHW//+/d3uD8A/MEkbQKN9/vnnWr9+vWPeTWxsrLp16yZJOnjwoHr37q2hQ4eqZ8+e+q//+i8tXrxY//73vxv0HY39jBUrVujaa69VUlKSYmNjNWvWLB0+fLhB3z1hwgTt3r1bXbt21S9/+Uv97W9/a9D7jx07pm+//VZDhw5t0PsAWI+ABKDRysvLdeutt2r37t1Or/3792vQoEGKiIjQRx99pL/85S+68sor9dJLL6lr164qKiqq93c05jPy8/N19913a+TIkVq9erV27dqlxx57TJWVlQ3q39VXX62ioiI9/fTTOnfunO666y7deeed9X5/8+bNG/R9APwHAQlAvURFRclmszm1XX311fr73/+ulJQUXXHFFU6viy66SNL3t8Jfe+21evLJJ7Vr1y5FRUXp/fffd/uZrjT0MzZv3qxOnTrpscceU79+/dS5c2d9/fXXdfbHlbi4OI0ZM0aLFy/WihUr9L//+786efKkJCkyMrLWz2jZsqVSUlK47R8IQAQkAPWSkpKirVu36tChQzpx4oTsdrsmT56skydPauzYsdq+fbsOHjyov/71r5o4caJsNpu2bt2qOXPmaMeOHTp8+LDee+89HT9+XN27d3d85hdffKGCggKdOHFCVVVVNb63MZ/RuXNnHT58WMuXL9fBgwf14osvOgLVD/tTVFSk3bt368SJE6qoqKjx3fPmzdOyZcv01VdfqbCwUO+8846SkpKUkJDg+Izc3FwVFxe7vez3xBNP6Pnnn9eLL76o/fv367PPPtNLL73UlEMBwBesvo0OQGAoKCgw11xzjWnevLnTbf6FhYXmJz/5iUlISDDNmzc33bp1M4888oix2+3myy+/NMOGDTOXXHKJiY6ONl26dDEvvfSS4zOPHTtmfvzjH5vY2Fi3t/k39jN+85vfmIsvvtjExsaaMWPGmBdeeMHEx8c73nf+/Hlzxx13mISEBLe3+b/22mumT58+5qKLLjJxcXFm6NCh5rPPPnN8xp///GdzxRVXmGbNmtV6m/+iRYtM165dTWRkpGnXrp35xS9+0ahjAMB3wowxxtKEBgAA4Ge4xAYAAFANAQkAAKAaAhIAAEA1BCQAAIBqCEgAAADVEJAAAACqISABAABUQ0ACAACohoAEAABQDQEJAACgGgISAABANf8PitOQj3Y7AOsAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "<Figure size 800x600 with 0 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "ht.PlotCdf()\n",
    "thinkplot.Show(xlabel='test statistic',\n",
    "               ylabel='CDF')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb53d0c2",
   "metadata": {},
   "source": [
    "Figure [\\[hypothesis1\\]](#hypothesis1){reference-type=\"ref\"\n",
    "reference=\"hypothesis1\"} shows the result. The CDF intersects the\n",
    "observed difference at 0.83, which is the complement of the p-value,\n",
    "0.17.\n",
    "\n",
    "If we run the same analysis with birth weight, the computed p-value is\n",
    "0; after 1000 attempts, the simulation never yields an effect as big as\n",
    "the observed difference, 0.12 lbs. So we would report $p < 0.001$, and\n",
    "conclude that the difference in birth weight is statistically\n",
    "significant."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "546e010e",
   "metadata": {},
   "source": [
    "## Other test statistics\n",
    "\n",
    "Choosing the best test statistic depends on what question you are trying\n",
    "to address. For example, if the relevant question is whether pregnancy\n",
    "lengths are different for first babies, then it makes sense to test the\n",
    "absolute difference in means, as we did in the previous section.\n",
    "\n",
    "If we had some reason to think that first babies are likely to be late,\n",
    "then we would not take the absolute value of the difference; instead we\n",
    "would use this test statistic:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "d05b7efa",
   "metadata": {},
   "outputs": [],
   "source": [
    "class DiffMeansOneSided(DiffMeansPermute):\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        group1, group2 = data\n",
    "        test_stat = group1.mean() - group2.mean()\n",
    "        return test_stat"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "360460b0",
   "metadata": {},
   "source": [
    "`DiffMeansOneSided` inherits `MakeModel` and `RunModel` from\n",
    "`DiffMeansPermute`; the only difference is that `TestStatistic` does not\n",
    "take the absolute value of the difference. This kind of test is called\n",
    "**one-sided** because it only counts one side of the distribution of\n",
    "differences. The previous test, using both sides, is **two-sided**.\n",
    "\n",
    "For this version of the test, the p-value is 0.09. In general the\n",
    "p-value for a one-sided test is about half the p-value for a two-sided\n",
    "test, depending on the shape of the distribution.\n",
    "\n",
    "The one-sided hypothesis, that first babies are born late, is more\n",
    "specific than the two-sided hypothesis, so the p-value is smaller. But\n",
    "even for the stronger hypothesis, the difference is not statistically\n",
    "significant."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b5cf8235",
   "metadata": {},
   "source": [
    "We can use the same framework to test for a difference in standard\n",
    "deviation. In\n",
    "Section [\\[visualization\\]](#visualization){reference-type=\"ref\"\n",
    "reference=\"visualization\"}, we saw some evidence that first babies are\n",
    "more likely to be early or late, and less likely to be on time. So we\n",
    "might hypothesize that the standard deviation is higher. Here's how we\n",
    "can test that:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "c66134af",
   "metadata": {},
   "outputs": [],
   "source": [
    "class DiffStdPermute(DiffMeansPermute):\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        group1, group2 = data\n",
    "        test_stat = group1.std() - group2.std()\n",
    "        return test_stat"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8112432b",
   "metadata": {},
   "source": [
    "This is a one-sided test because the hypothesis is that the standard\n",
    "deviation for first babies is higher, not just different. The p-value is\n",
    "0.09, which is not statistically significant."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7efb6d21",
   "metadata": {},
   "source": [
    "## Testing a correlation\n",
    "\n",
    "This framework can also test correlations. For example, in the NSFG data\n",
    "set, the correlation between birth weight and mother's age is about\n",
    "0.07. It seems like older mothers have heavier babies. But could this\n",
    "effect be due to chance?\n",
    "\n",
    "For the test statistic, I use Pearson's correlation, but Spearman's\n",
    "would work as well. If we had reason to expect positive correlation, we\n",
    "would do a one-sided test. But since we have no such reason, I'll do a\n",
    "two-sided test using the absolute value of correlation.\n",
    "\n",
    "The null hypothesis is that there is no correlation between mother's age\n",
    "and birth weight. By shuffling the observed values, we can simulate a\n",
    "world where the distributions of age and birth weight are the same, but\n",
    "where the variables are unrelated:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "e34d9386",
   "metadata": {},
   "outputs": [],
   "source": [
    "class CorrelationPermute(thinkstats2.HypothesisTest):\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        xs, ys = data\n",
    "        test_stat = abs(thinkstats2.Corr(xs, ys))\n",
    "        return test_stat\n",
    "\n",
    "    def RunModel(self):\n",
    "        xs, ys = self.data\n",
    "        xs = np.random.permutation(xs)\n",
    "        return xs, ys"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc1c7f94",
   "metadata": {},
   "source": [
    "`data` is a pair of sequences. `TestStatistic` computes the absolute\n",
    "value of Pearson's correlation. `RunModel` shuffles the `xs` and returns\n",
    "simulated data.\n",
    "\n",
    "Here's the code that reads the data and runs the test:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "782a7885",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.0"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "live, firsts, others = first.MakeFrames()\n",
    "live = live.dropna(subset=['agepreg', 'totalwgt_lb'])\n",
    "data = live.agepreg.values, live.totalwgt_lb.values\n",
    "ht = CorrelationPermute(data)\n",
    "p_value = ht.PValue()\n",
    "p_value"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb204285",
   "metadata": {},
   "source": [
    "I use `dropna` with the `subset` argument to drop rows that are missing\n",
    "either of the variables we need.\n",
    "\n",
    "The actual correlation is 0.07. The computed p-value is 0; after 1000\n",
    "iterations the largest simulated correlation is 0.04. So although the\n",
    "observed correlation is small, it is statistically significant.\n",
    "\n",
    "This example is a reminder that \"statistically significant\" does not\n",
    "always mean that an effect is important, or significant in practice. It\n",
    "only means that it is unlikely to have occurred by chance."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd1b7cdf",
   "metadata": {},
   "source": [
    "## Testing proportions\n",
    "\n",
    "Suppose you run a casino and you suspect that a customer is using a\n",
    "crooked die; that is, one that has been modified to make one of the\n",
    "faces more likely than the others. You apprehend the alleged cheater and\n",
    "confiscate the die, but now you have to prove that it is crooked. You\n",
    "roll the die 60 times and get the following results:\n",
    "\n",
    "```\n",
    " center\n",
    "  Value        1   2   3    4   5   6\n",
    "  ----------- --- --- ---- --- --- ----\n",
    "  Frequency    8   9   19   5   8   11\n",
    "```\n",
    "\n",
    "On average you expect each value to appear 10 times. In this dataset,\n",
    "the value 3 appears more often than expected, and the value 4 appears\n",
    "less often. But are these differences statistically significant?\n",
    "\n",
    "To test this hypothesis, we can compute the expected frequency for each\n",
    "value, the difference between the expected and observed frequencies, and\n",
    "the total absolute difference. In this example, we expect each side to\n",
    "come up 10 times out of 60; the deviations from this expectation are -2,\n",
    "-1, 9, -5, -2, and 1; so the total absolute difference is 20. How often\n",
    "would we see such a difference by chance?\n",
    "\n",
    "Here's a version of `HypothesisTest` that answers that question:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "9a530baf",
   "metadata": {},
   "outputs": [],
   "source": [
    "class DiceTest(thinkstats2.HypothesisTest):\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        observed = data\n",
    "        n = sum(observed)\n",
    "        expected = np.ones(6) * n / 6\n",
    "        test_stat = sum(abs(observed - expected))\n",
    "        return test_stat\n",
    "\n",
    "    def RunModel(self):\n",
    "        n = sum(self.data)\n",
    "        values = [1, 2, 3, 4, 5, 6]\n",
    "        rolls = np.random.choice(values, n, replace=True)\n",
    "        hist = thinkstats2.Hist(rolls)\n",
    "        freqs = hist.Freqs(values)\n",
    "        return freqs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa590a55",
   "metadata": {},
   "source": [
    "The data are represented as a list of frequencies: the observed values\n",
    "are `[8, 9, 19, 5, 8, 11]`; the expected frequencies are all 10. The\n",
    "test statistic is the sum of the absolute differences.\n",
    "\n",
    "The null hypothesis is that the die is fair, so we simulate that by\n",
    "drawing random samples from `values`. `RunModel` uses `Hist` to compute\n",
    "and return the list of frequencies.\n",
    "\n",
    "The p-value for this data is 0.13, which means that if the die is fair\n",
    "we expect to see the observed total deviation, or more, about 13% of the\n",
    "time. So the apparent effect is not statistically significant."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "544584d8",
   "metadata": {},
   "source": [
    "## Chi-squared tests\n",
    "\n",
    "In the previous section we used total deviation as the test statistic.\n",
    "But for testing proportions it is more common to use the chi-squared\n",
    "statistic: $$\\chi^2 = \\sum_i \\frac{(O_i - E_i)^2}{E_i}$$ Where $O_i$\n",
    "are the observed frequencies and $E_i$ are the expected frequencies.\n",
    "Here's the Python code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "398ea4cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "class DiceChiTest(DiceTest):\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        observed = data\n",
    "        n = sum(observed)\n",
    "        expected = np.ones(6) * n / 6\n",
    "        test_stat = sum((observed - expected)**2 / expected)\n",
    "        return test_stat"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cc8c586e",
   "metadata": {},
   "source": [
    "Squaring the deviations (rather than taking absolute values) gives more\n",
    "weight to large deviations. Dividing through by `expected` standardizes\n",
    "the deviations, although in this case it has no effect because the\n",
    "expected frequencies are all equal.\n",
    "\n",
    "The p-value using the chi-squared statistic is 0.04, substantially\n",
    "smaller than what we got using total deviation, 0.13. If we take the 5%\n",
    "threshold seriously, we would consider this effect statistically\n",
    "significant. But considering the two tests togther, I would say that the\n",
    "results are borderline. I would not rule out the possibility that the\n",
    "die is crooked, but I would not convict the accused cheater.\n",
    "\n",
    "This example demonstrates an important point: the p-value depends on the\n",
    "choice of test statistic and the model of the null hypothesis, and\n",
    "sometimes these choices determine whether an effect is statistically\n",
    "significant or not."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a41e6d28",
   "metadata": {},
   "source": [
    "## First babies again\n",
    "\n",
    "Earlier in this chapter we looked at pregnancy lengths for first babies\n",
    "and others, and concluded that the apparent differences in mean and\n",
    "standard deviation are not statistically significant. But in\n",
    "Section [\\[visualization\\]](#visualization){reference-type=\"ref\"\n",
    "reference=\"visualization\"}, we saw several apparent differences in the\n",
    "distribution of pregnancy length, especially in the range from 35 to 43\n",
    "weeks. To see whether those differences are statistically significant,\n",
    "we can use a test based on a chi-squared statistic.\n",
    "\n",
    "The code combines elements from previous examples:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "890dc8a9",
   "metadata": {},
   "outputs": [],
   "source": [
    "class PregLengthTest(thinkstats2.HypothesisTest):\n",
    "\n",
    "    def MakeModel(self):\n",
    "        firsts, others = self.data\n",
    "        self.n = len(firsts)\n",
    "        self.pool = np.hstack((firsts, others))\n",
    "\n",
    "        pmf = thinkstats2.Pmf(self.pool)\n",
    "        self.values = range(35, 44)\n",
    "        self.expected_probs = np.array(pmf.Probs(self.values))\n",
    "\n",
    "    def RunModel(self):\n",
    "        np.random.shuffle(self.pool)\n",
    "        data = self.pool[:self.n], self.pool[self.n:]\n",
    "        return data\n",
    "\n",
    "    def TestStatistic(self, data):\n",
    "        firsts, others = data\n",
    "        stat = self.ChiSquared(firsts) + self.ChiSquared(others)\n",
    "        return stat\n",
    "\n",
    "    def ChiSquared(self, lengths):\n",
    "        hist = thinkstats2.Hist(lengths)\n",
    "        observed = np.array(hist.Freqs(self.values))\n",
    "        expected = self.expected_probs * len(lengths)\n",
    "        stat = sum((observed - expected)**2 / expected)\n",
    "        return stat"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dcc044ef",
   "metadata": {},
   "source": [
    "The data are represented as two lists of pregnancy lengths. The null\n",
    "hypothesis is that both samples are drawn from the same distribution.\n",
    "`MakeModel` models that distribution by pooling the two samples using\n",
    "`hstack`. Then `RunModel` generates simulated data by shuffling the\n",
    "pooled sample and splitting it into two parts.\n",
    "\n",
    "`MakeModel` also defines `values`, which is the range of weeks we'll\n",
    "use, and `expected_probs`, which is the probability of each value in the\n",
    "pooled distribution.\n",
    "\n",
    "Here's the code that computes the test statistic:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "a9d180e8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "p-value = 0.0\n",
      "actual = 101.50141482893264\n",
      "ts max = 26.34915336197512\n"
     ]
    }
   ],
   "source": [
    "data = firsts.prglngth.values, others.prglngth.values\n",
    "ht = PregLengthTest(data)\n",
    "p_value = ht.PValue()\n",
    "print('p-value =', p_value)\n",
    "print('actual =', ht.actual)\n",
    "print('ts max =', ht.MaxTestStat())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb3ba9dc",
   "metadata": {},
   "source": [
    "`TestStatistic` computes the chi-squared statistic for first babies and\n",
    "others, and adds them.\n",
    "\n",
    "`ChiSquared` takes a sequence of pregnancy lengths, computes its\n",
    "histogram, and computes `observed`, which is a list of frequencies\n",
    "corresponding to `self.values`. To compute the list of expected\n",
    "frequencies, it multiplies the pre-computed probabilities,\n",
    "`expected_probs`, by the sample size. It returns the chi-squared\n",
    "statistic, `stat`.\n",
    "\n",
    "For the NSFG data the total chi-squared statistic is 102, which doesn't\n",
    "mean much by itself. But after 1000 iterations, the largest test\n",
    "statistic generated under the null hypothesis is 32. We conclude that\n",
    "the observed chi-squared statistic is unlikely under the null\n",
    "hypothesis, so the apparent effect is statistically significant.\n",
    "\n",
    "This example demonstrates a limitation of chi-squared tests: they\n",
    "indicate that there is a difference between the two groups, but they\n",
    "don't say anything specific about what the difference is."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec5893c6",
   "metadata": {},
   "source": [
    "## Errors\n",
    "\n",
    "In classical hypothesis testing, an effect is considered statistically\n",
    "significant if the p-value is below some threshold, commonly 5%. This\n",
    "procedure raises two questions:\n",
    "\n",
    "-   If the effect is actually due to chance, what is the probability\n",
    "    that we will wrongly consider it significant? This probability is\n",
    "    the **false positive rate**.\n",
    "\n",
    "-   If the effect is real, what is the chance that the hypothesis test\n",
    "    will fail? This probability is the **false negative rate**.\n",
    "\n",
    "The false positive rate is relatively easy to compute: if the threshold\n",
    "is 5%, the false positive rate is 5%. Here's why:\n",
    "\n",
    "-   If there is no real effect, the null hypothesis is true, so we can\n",
    "    compute the distribution of the test statistic by simulating the\n",
    "    null hypothesis. Call this distribution $\\CDF_T$.\n",
    "\n",
    "-   Each time we run an experiment, we get a test statistic, $t$, which\n",
    "    is drawn from $CDF_T$. Then we compute a p-value, which is the\n",
    "    probability that a random value from $CDF_T$ exceeds `t`, so that's\n",
    "    $1 - CDF_T(t)$.\n",
    "\n",
    "-   The p-value is less than 5% if $CDF_T(t)$ is greater than 95%; that\n",
    "    is, if $t$ exceeds the 95th percentile. And how often does a value\n",
    "    chosen from $CDF_T$ exceed the 95th percentile? 5% of the time.\n",
    "\n",
    "So if you perform one hypothesis test with a 5% threshold, you expect a\n",
    "false positive 1 time in 20."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8789189",
   "metadata": {},
   "source": [
    "## Power\n",
    "\n",
    "The false negative rate is harder to compute because it depends on the\n",
    "actual effect size, and normally we don't know that. One option is to\n",
    "compute a rate conditioned on a hypothetical effect size.\n",
    "\n",
    "For example, if we assume that the observed difference between groups is\n",
    "accurate, we can use the observed samples as a model of the population\n",
    "and run hypothesis tests with simulated data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "6692e494",
   "metadata": {},
   "outputs": [],
   "source": [
    "def FalseNegRate(data, num_runs=100):\n",
    "    group1, group2 = data\n",
    "    count = 0\n",
    "\n",
    "    for i in range(num_runs):\n",
    "        sample1 = thinkstats2.Resample(group1)\n",
    "        sample2 = thinkstats2.Resample(group2)\n",
    "\n",
    "        ht = DiffMeansPermute((sample1, sample2))\n",
    "        pvalue = ht.PValue(iters=101)\n",
    "        if pvalue > 0.05:\n",
    "            count += 1\n",
    "\n",
    "    return count / num_runs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e20ffd80",
   "metadata": {},
   "source": [
    "`FalseNegRate` takes data in the form of two sequences, one for each\n",
    "group. Each time through the loop, it simulates an experiment by drawing\n",
    "a random sample from each group and running a hypothesis test. Then it\n",
    "checks the result and counts the number of false negatives.\n",
    "\n",
    "`Resample` takes a sequence and draws a sample with the same length,\n",
    "with replacement:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "29d63681",
   "metadata": {},
   "outputs": [],
   "source": [
    "def Resample(xs):\n",
    "    return np.random.choice(xs, len(xs), replace=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4fc4a5e8",
   "metadata": {},
   "source": [
    "Here's the code that tests pregnancy lengths:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "4ff48293",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.7"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "live, firsts, others = first.MakeFrames()\n",
    "data = firsts.prglngth.values, others.prglngth.values\n",
    "neg_rate = FalseNegRate(data)\n",
    "neg_rate"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7dade836",
   "metadata": {},
   "source": [
    "The result is about 70%, which means that if the actual difference in\n",
    "mean pregnancy length is 0.078 weeks, we expect an experiment with this\n",
    "sample size to yield a negative test 70% of the time.\n",
    "\n",
    "This result is often presented the other way around: if the actual\n",
    "difference is 0.078 weeks, we should expect a positive test only 30% of\n",
    "the time. This \"correct positive rate\" is called the **power** of the\n",
    "test, or sometimes \"sensitivity\". It reflects the ability of the test to\n",
    "detect an effect of a given size.\n",
    "\n",
    "In this example, the test had only a 30% chance of yielding a positive\n",
    "result (again, assuming that the difference is 0.078 weeks). As a rule\n",
    "of thumb, a power of 80% is considered acceptable, so we would say that\n",
    "this test was \"underpowered.\"\n",
    "\n",
    "In general a negative hypothesis test does not imply that there is no\n",
    "difference between the groups; instead it suggests that if there is a\n",
    "difference, it is too small to detect with this sample size."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89c935af",
   "metadata": {},
   "source": [
    "## Replication\n",
    "\n",
    "The hypothesis testing process I demonstrated in this chapter is not,\n",
    "strictly speaking, good practice.\n",
    "\n",
    "First, I performed multiple tests. If you run one hypothesis test, the\n",
    "chance of a false positive is about 1 in 20, which might be acceptable.\n",
    "But if you run 20 tests, you should expect at least one false positive,\n",
    "most of the time.\n",
    "\n",
    "Second, I used the same dataset for exploration and testing. If you\n",
    "explore a large dataset, find a surprising effect, and then test whether\n",
    "it is significant, you have a good chance of generating a false\n",
    "positive.\n",
    "\n",
    "To compensate for multiple tests, you can adjust the p-value threshold\n",
    "(see <https://en.wikipedia.org/wiki/Holm-Bonferroni_method>). Or you can\n",
    "address both problems by partitioning the data, using one set for\n",
    "exploration and the other for testing."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "729695dc",
   "metadata": {},
   "source": [
    "In some fields these practices are required or at least encouraged. But\n",
    "it is also common to address these problems implicitly by replicating\n",
    "published results. Typically the first paper to report a new result is\n",
    "considered exploratory. Subsequent papers that replicate the result with\n",
    "new data are considered confirmatory.\n",
    "\n",
    "As it happens, we have an opportunity to replicate the results in this\n",
    "chapter. The first edition of this book is based on Cycle 6 of the NSFG,\n",
    "which was released in 2002. In October 2011, the CDC released additional\n",
    "data based on interviews conducted from 2006--2010. `nsfg2.py` contains\n",
    "code to read and clean this data. In the new dataset:\n",
    "\n",
    "-   The difference in mean pregnancy length is 0.16 weeks and\n",
    "    statistically significant with $p < 0.001$ (compared to 0.078 weeks\n",
    "    in the original dataset).\n",
    "\n",
    "-   The difference in birth weight is 0.17 pounds with $p < 0.001$\n",
    "    (compared to 0.12 lbs in the original dataset).\n",
    "\n",
    "-   The correlation between birth weight and mother's age is 0.08 with\n",
    "    $p < 0.001$ (compared to 0.07).\n",
    "\n",
    "-   The chi-squared test is statistically significant with $p < 0.001$\n",
    "    (as it was in the original).\n",
    "\n",
    "In summary, all of the effects that were statistically significant in\n",
    "the original dataset were replicated in the new dataset, and the\n",
    "difference in pregnancy length, which was not significant in the\n",
    "original, is bigger in the new dataset and significant."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f1ff960",
   "metadata": {},
   "source": [
    "## Glossary\n",
    "\n",
    "-   **hypothesis testing**: The process of determining whether an\n",
    "    apparent effect is statistically significant.\n",
    "\n",
    "-   **test statistic**: A statistic used to quantify an effect size.\n",
    "\n",
    "-   **null hypothesis**: A model of a system based on the assumption\n",
    "    that an apparent effect is due to chance.\n",
    "\n",
    "-   **p-value**: The probability that an effect could occur by chance.\n",
    "\n",
    "-   **statistically significant**: An effect is statistically\n",
    "    significant if it is unlikely to occur by chance.\n",
    "\n",
    "-   **permutation test**: A way to compute p-values by generating\n",
    "    permutations of an observed dataset.\n",
    "\n",
    "-   **resampling test**: A way to compute p-values by generating\n",
    "    samples, with replacement, from an observed dataset.\n",
    "\n",
    "-   **two-sided test**: A test that asks, \"What is the chance of an\n",
    "    effect as big as the observed effect, positive or negative?\"\n",
    "\n",
    "-   **one-sided test**: A test that asks, \"What is the chance of an\n",
    "    effect as big as the observed effect, and with the same sign?\"\n",
    "\n",
    "-   **chi-squared test**: A test that uses the chi-squared statistic as\n",
    "    the test statistic.\n",
    "\n",
    "-   **false positive**: The conclusion that an effect is real when it is\n",
    "    not.\n",
    "\n",
    "-   **false negative**: The conclusion that an effect is due to chance\n",
    "    when it is not.\n",
    "\n",
    "-   **power**: The probability of a positive test if the null hypothesis\n",
    "    is false.\n",
    "\n",
    "[^1]: For more about Bayesian inference, see the sequel to this book,\n",
    "    *Think Bayes*.\n",
    "\n",
    "[^2]: Adapted from MacKay, *Information Theory, Inference, and Learning\n",
    "    Algorithms*, 2003."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "940c93d6",
   "metadata": {},
   "source": [
    "## Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c31607e8",
   "metadata": {},
   "source": [
    "**Exercise:** As sample size increases, the power of a hypothesis test increases, which means it is more likely to be positive if the effect is real. Conversely, as sample size decreases, the test is less likely to be positive even if the effect is real.\n",
    "\n",
    "To investigate this behavior, run the tests in this chapter with different subsets of the NSFG data. You can use `thinkstats2.SampleRows` to select a random subset of the rows in a DataFrame.\n",
    "\n",
    "What happens to the p-values of these tests as sample size decreases? What is the smallest sample size that yields a positive test?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "070d2b33",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "def RunTests(live, iters=1000):\n",
    "    \"\"\"Runs the tests from Chapter 9 with a subset of the data.\n",
    "\n",
    "    live: DataFrame\n",
    "    iters: how many iterations to run\n",
    "    \"\"\"\n",
    "    n = len(live)\n",
    "    firsts = live[live.birthord == 1]\n",
    "    others = live[live.birthord != 1]\n",
    "\n",
    "    # compare pregnancy lengths\n",
    "    data = firsts.prglngth.values, others.prglngth.values\n",
    "    ht = DiffMeansPermute(data)\n",
    "    p1 = ht.PValue(iters=iters)\n",
    "\n",
    "    data = (firsts.totalwgt_lb.dropna().values,\n",
    "            others.totalwgt_lb.dropna().values)\n",
    "    ht = DiffMeansPermute(data)\n",
    "    p2 = ht.PValue(iters=iters)\n",
    "\n",
    "    # test correlation\n",
    "    live2 = live.dropna(subset=['agepreg', 'totalwgt_lb'])\n",
    "    data = live2.agepreg.values, live2.totalwgt_lb.values\n",
    "    ht = CorrelationPermute(data)\n",
    "    p3 = ht.PValue(iters=iters)\n",
    "\n",
    "    # compare pregnancy lengths (chi-squared)\n",
    "    data = firsts.prglngth.values, others.prglngth.values\n",
    "    ht = PregLengthTest(data)\n",
    "    p4 = ht.PValue(iters=iters)\n",
    "\n",
    "    print('%d\\t%0.2f\\t%0.2f\\t%0.2f\\t%0.2f' % (n, p1, p2, p3, p4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "7ab7fb0c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "9148\t0.19\t0.00\t0.00\t0.00\n",
      "4574\t0.18\t0.00\t0.00\t0.00\n",
      "2287\t0.65\t0.01\t0.08\t0.00\n",
      "1143\t0.60\t0.52\t0.01\t0.11\n",
      "571\t0.51\t0.53\t0.00\t0.00\n",
      "285\t0.59\t0.08\t0.00\t0.22\n",
      "142\t0.34\t0.19\t0.62\t0.21\n"
     ]
    }
   ],
   "source": [
    "# Solution\n",
    "\n",
    "n = len(live)\n",
    "for _ in range(7):\n",
    "    sample = thinkstats2.SampleRows(live, n)\n",
    "    RunTests(sample)\n",
    "    n //= 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "303ef659",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# My results:\n",
    "\n",
    "# test1: difference in mean pregnancy length\n",
    "# test2: difference in mean birth weight\n",
    "# test3: correlation of mother's age and birth weight\n",
    "# test4: chi-square test of pregnancy length\n",
    "\n",
    "# n       test1   test2   test2   test4\n",
    "# 9148\t0.16\t0.00\t0.00\t0.00\n",
    "# 4574\t0.10\t0.01\t0.00\t0.00\n",
    "# 2287\t0.25\t0.06\t0.00\t0.00\n",
    "# 1143\t0.24\t0.03\t0.39\t0.03\n",
    "# 571\t0.81\t0.00\t0.04\t0.04\n",
    "# 285\t0.57\t0.41\t0.48\t0.83\n",
    "# 142\t0.45\t0.08\t0.60\t0.04\n",
    "\n",
    "# Conclusion: As expected, tests that are positive with large sample\n",
    "# sizes become negative as we take away data.  But the pattern is\n",
    "# erratic, with some positive tests even at small sample sizes.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "91f4df49",
   "metadata": {},
   "source": [
    "**Exercise:** In Section 9.3, we simulated the null hypothesis by permutation; that is, we treated the observed values as if they represented the entire population, and randomly assigned the members of the population to the two groups.\n",
    "\n",
    "An alternative is to use the sample to estimate the distribution for the population, then draw a random sample from that distribution. This process is called resampling. There are several ways to implement resampling, but one of the simplest is to draw a sample with replacement from the observed values, as in Section 9.10.\n",
    "\n",
    "Write a class named `DiffMeansResample` that inherits from `DiffMeansPermute` and overrides `RunModel` to implement resampling, rather than permutation.\n",
    "\n",
    "Use this model to test the differences in pregnancy length and birth weight. How much does the model affect the results?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "aac7aaab",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "class DiffMeansResample(DiffMeansPermute):\n",
    "    \"\"\"Tests a difference in means using resampling.\"\"\"\n",
    "    \n",
    "    def RunModel(self):\n",
    "        \"\"\"Run the model of the null hypothesis.\n",
    "\n",
    "        returns: simulated data\n",
    "        \"\"\"\n",
    "        group1 = np.random.choice(self.pool, self.n, replace=True)\n",
    "        group2 = np.random.choice(self.pool, self.m, replace=True)\n",
    "        return group1, group2\n",
    "  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "241fe939",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "def RunResampleTest(firsts, others):\n",
    "    \"\"\"Tests differences in means by resampling.\n",
    "\n",
    "    firsts: DataFrame\n",
    "    others: DataFrame\n",
    "    \"\"\"\n",
    "    data = firsts.prglngth.values, others.prglngth.values\n",
    "    ht = DiffMeansResample(data)\n",
    "    p_value = ht.PValue(iters=10000)\n",
    "    print('\\ndiff means resample preglength')\n",
    "    print('p-value =', p_value)\n",
    "    print('actual =', ht.actual)\n",
    "    print('ts max =', ht.MaxTestStat())\n",
    "\n",
    "    data = (firsts.totalwgt_lb.dropna().values,\n",
    "            others.totalwgt_lb.dropna().values)\n",
    "    ht = DiffMeansPermute(data)\n",
    "    p_value = ht.PValue(iters=10000)\n",
    "    print('\\ndiff means resample birthweight')\n",
    "    print('p-value =', p_value)\n",
    "    print('actual =', ht.actual)\n",
    "    print('ts max =', ht.MaxTestStat())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "9c08df55",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "diff means resample preglength\n",
      "p-value = 0.1721\n",
      "actual = 0.07803726677754952\n",
      "ts max = 0.2261711162972233\n",
      "\n",
      "diff means resample birthweight\n",
      "p-value = 0.0001\n",
      "actual = 0.12476118453549034\n",
      "ts max = 0.12764183747384727\n"
     ]
    }
   ],
   "source": [
    "# Solution\n",
    "\n",
    "RunResampleTest(firsts, others)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "d8820f9d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# Conclusions: Using resampling instead of permutation has very\n",
    "# little effect on the results.\n",
    "\n",
    "# The two models are based on slightly difference assumptions, and in\n",
    "# this example there is no compelling reason to choose one or the other.\n",
    "# But in general p-values depend on the choice of the null hypothesis;\n",
    "# different models can yield very different results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "08891d42",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
